Cosine distribution (continuous)#
A compact-support, symmetric distribution on an interval of length \(2\pi\) with density proportional to \(1+\cos(x)\). It’s a simple, smooth alternative to a uniform distribution when you want more mass in the middle and zero density at the endpoints.
In this notebook you’ll:
Define the PDF/CDF (including location/scale form)
Derive key moments (mean/variance) and the likelihood
Implement NumPy-only sampling (accept–reject)
Visualize PDF, CDF, and Monte Carlo samples
Use scipy.stats.cosine for practical work
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from scipy import stats
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)
1) Title & Classification#
Name: cosine distribution
Type: continuous
Standard support: \(x \in [-\pi,\,\pi]\)
Standard parameters: none (fixed shape)
SciPy exposes it as a location–scale family:
Parameters: loc = \(\mu \in \mathbb{R}\), scale = \(\sigma > 0\)
Support: \(x \in [\mu-\pi\sigma,\,\mu+\pi\sigma]\)
2) Intuition & Motivation#
What it models#
The cosine distribution is a bounded, symmetric, unimodal distribution with density that smoothly decreases from the center to the edges.
Think of it as a “bumped” uniform distribution on \([-\pi,\pi]\):
Uniform: constant density on the interval.
Cosine: density \(\propto 1+\cos(x)\), so it peaks at \(x=0\) and hits 0 at \(\pm\pi\).
Typical use cases#
A simple model or prior for a quantity constrained to an interval (especially an angle difference mapped to \([-\pi,\pi]\)) with central tendency.
A pedagogical example of a finite-support distribution with closed-form CDF and tractable moments.
Lightweight generative modeling on bounded domains (possibly as a mixture component).
Relations to other distributions#
Location–scale family of the standard cosine distribution.
Closely related to the raised cosine shape (common in signal processing).
Via a transformation, it connects to the Wigner semicircle distribution: if \(U\) has semicircle density on \([-1,1]\), then \(X = 2\arcsin(U)\) follows the cosine distribution.
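The semicircle connection can be checked numerically. The sketch below (assuming `scipy.stats.semicircular` for the semicircle draws) transforms semicircle samples and compares the resulting empirical CDF to the cosine CDF:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Draw from the Wigner semicircle distribution on [-1, 1]
u = stats.semicircular.rvs(size=100_000, random_state=rng)

# Transform: X = 2*arcsin(U) should follow the standard cosine distribution
x = 2.0 * np.arcsin(u)

# Compare the empirical CDF to the cosine CDF at a few probe points
probe = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
ecdf = np.array([(x <= p).mean() for p in probe])
print(np.max(np.abs(ecdf - stats.cosine.cdf(probe))))  # small sampling error
```

The discrepancy shrinks at the usual \(O(n^{-1/2})\) Monte Carlo rate.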
3) Formal Definition#
PDF#
Standard form:
\[ f(x) = \frac{1 + \cos(x)}{2\pi}, \qquad x \in [-\pi,\,\pi], \]
and \(f(x)=0\) otherwise.
Location–scale form (SciPy’s parameterization):
\[ f(x \mid \mu, \sigma) = \frac{1}{\sigma}\,f_0\!\left(\frac{x-\mu}{\sigma}\right) = \frac{1 + \cos\!\left(\frac{x-\mu}{\sigma}\right)}{2\pi\sigma}, \qquad x \in [\mu-\pi\sigma,\,\mu+\pi\sigma]. \]
CDF#
For the standard form:
\[ F_0(x) = \frac{x + \sin(x) + \pi}{2\pi}, \qquad x \in [-\pi,\,\pi], \]
with \(F_0(x)=0\) for \(x<-\pi\) and \(F_0(x)=1\) for \(x>\pi\).
For \((\mu,\sigma)\): \(F(x\mid\mu,\sigma) = F_0\!\left(\frac{x-\mu}{\sigma}\right)\) where \(F_0\) is the standard CDF above.
PI = np.pi
def cosine_support(loc: float = 0.0, scale: float = 1.0) -> tuple[float, float]:
if scale <= 0:
raise ValueError("scale must be > 0")
return loc - PI * scale, loc + PI * scale
def cosine_pdf(x, loc: float = 0.0, scale: float = 1.0):
x = np.asarray(x)
if scale <= 0:
raise ValueError("scale must be > 0")
z = (x - loc) / scale
inside = (-PI <= z) & (z <= PI)
out = np.zeros_like(z, dtype=float)
out[inside] = (1.0 + np.cos(z[inside])) / (2.0 * PI * scale)
return out
def cosine_cdf(x, loc: float = 0.0, scale: float = 1.0):
x = np.asarray(x)
if scale <= 0:
raise ValueError("scale must be > 0")
z = (x - loc) / scale
out = np.empty_like(z, dtype=float)
out[z < -PI] = 0.0
out[z > PI] = 1.0
inside = (-PI <= z) & (z <= PI)
out[inside] = (z[inside] + np.sin(z[inside]) + PI) / (2.0 * PI)
return out
def cosine_logpdf(x, loc: float = 0.0, scale: float = 1.0):
x = np.asarray(x)
if scale <= 0:
raise ValueError("scale must be > 0")
z = (x - loc) / scale
inside = (-PI <= z) & (z <= PI)
cosz = np.cos(z)
cosz = np.clip(cosz, -1.0, 1.0)
out = np.full_like(z, -np.inf, dtype=float)
with np.errstate(divide="ignore"):
out[inside] = np.log1p(cosz[inside]) - np.log(2.0 * PI * scale)
return out
# Quick sanity checks
x_grid = np.linspace(-PI, PI, 20001)
pdf_grid = cosine_pdf(x_grid)
mass_trapz = np.trapz(pdf_grid, x_grid)
print("Numerical integral of PDF over support:", mass_trapz)
# Match SciPy
max_abs_pdf_err = np.max(np.abs(pdf_grid - stats.cosine.pdf(x_grid)))
max_abs_cdf_err = np.max(np.abs(cosine_cdf(x_grid) - stats.cosine.cdf(x_grid)))
print("max |pdf - scipy.pdf|:", max_abs_pdf_err)
print("max |cdf - scipy.cdf|:", max_abs_cdf_err)
Numerical integral of PDF over support: 1.0
max |pdf - scipy.pdf|: 5.551115123125783e-17
max |cdf - scipy.cdf|: 1.1102230246251565e-16
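The CDF \(F_0(x) = (x + \sin x + \pi)/(2\pi)\) has no closed-form inverse, so quantiles must be found numerically. A minimal sketch using `scipy.optimize.brentq` (SciPy’s `.ppf` also inverts the CDF numerically for this distribution):

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

PI = np.pi

def cosine_ppf_numeric(q: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """Invert the standard cosine CDF (z + sin z + pi)/(2*pi) by root finding."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be in (0, 1)")
    # F0 is strictly increasing on (-pi, pi), so brentq has a unique root.
    z = brentq(lambda z: (z + np.sin(z) + PI) / (2.0 * PI) - q, -PI, PI)
    return loc + scale * z

# Compare against SciPy's ppf
for q in (0.1, 0.5, 0.9):
    print(q, cosine_ppf_numeric(q), stats.cosine.ppf(q))
```

By symmetry the median is exactly 0 in the standard form, which makes a handy sanity check.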
4) Moments & Properties#
Let \(X\) denote the standard cosine distribution on \([-\pi,\pi]\).
Mean, variance, skewness, kurtosis#
By symmetry, \(\mathbb{E}[X]=0\).
The variance is
\[ \operatorname{Var}(X) = \frac{\pi^2}{3} - 2 \approx 1.2899. \]
Skewness is 0 (symmetric distribution).
The fourth moment is
\[ \mathbb{E}[X^4] = \frac{\pi^4}{5} - 4\pi^2 + 24, \]
so the (excess) kurtosis is
\[ \gamma_2 = \frac{\mathbb{E}[X^4]}{\operatorname{Var}(X)^2} - 3 = \frac{6\,(90 - \pi^4)}{5\,(\pi^2 - 6)^2} \approx -0.5938. \]
For the location–scale version \(Y = \mu + \sigma X\):
\(\mathbb{E}[Y]=\mu\)
\(\operatorname{Var}(Y)=\sigma^2\left(\frac{\pi^2}{3}-2\right)\)
Skewness and kurtosis are unchanged by location–scale.
MGF / characteristic function#
The MGF exists for all real \(t\) (bounded support):
\[ M(t) = \mathbb{E}\!\left[e^{tX}\right] = \frac{\sinh(\pi t)}{\pi t\,(1 + t^2)}, \qquad M(0) = 1. \]
The characteristic function is
\[ \varphi(\omega) = \frac{\sin(\pi \omega)}{\pi \omega\,(1 - \omega^2)}, \]
with removable singularities at \(\omega\in\{0,\pm 1\}\).
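The closed form for \(M(t)\) can be validated against direct numerical integration of \(\mathbb{E}[e^{tX}]\); a small check using `scipy.integrate.quad`:

```python
import numpy as np
from scipy import integrate

PI = np.pi

def cosine_mgf(t: float) -> float:
    """Closed-form MGF of the standard cosine distribution."""
    if t == 0.0:
        return 1.0  # removable singularity at t = 0
    return np.sinh(PI * t) / (PI * t * (1.0 + t**2))

# Compare against quadrature of E[exp(tX)] over the support
for t in (0.5, 1.0, 2.0):
    val, _ = integrate.quad(
        lambda x: np.exp(t * x) * (1.0 + np.cos(x)) / (2.0 * PI), -PI, PI
    )
    print(t, cosine_mgf(t), val)
```

The two columns should agree to quadrature precision.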
Entropy#
The differential entropy (in nats) for the standard form is
\[ h(X) = \ln(4\pi) - 1 \approx 1.5310. \]
For \(Y = \mu + \sigma X\), \(h(Y)=h(X)+\ln\sigma\).
# Closed-form moment constants (standard form)
var0 = PI**2 / 3.0 - 2.0
m4 = PI**4 / 5.0 - 4.0 * PI**2 + 24.0
kurtosis_excess0 = m4 / (var0**2) - 3.0
entropy0 = np.log(4.0 * PI) - 1.0
print("Var[X] =", var0)
print("Excess kurtosis =", kurtosis_excess0)
print("Entropy (nats) =", entropy0)
# Monte Carlo verification
n = 400_000
x_mc = stats.cosine.rvs(size=n, random_state=rng)
mean_mc = x_mc.mean()
var_mc = x_mc.var()
m4_mc = np.mean((x_mc - mean_mc) ** 4)
kurtosis_excess_mc = m4_mc / (var_mc**2) - 3.0
print()
print("Monte Carlo (n=%d)" % n)
print("mean ~", mean_mc)
print("var ~", var_mc)
print("excess kurtosis ~", kurtosis_excess_mc)
Var[X] = 1.2898681336964528
Excess kurtosis = -0.5937628755982818
Entropy (nats) = 1.5310242469692907
Monte Carlo (n=400000)
mean ~ -0.0005897480997648315
var ~ 1.2884274555095643
excess kurtosis ~ -0.5946739650596489
5) Parameter Interpretation#
SciPy’s cosine is the standard distribution with location and scale:
loc = \(\mu\): shifts the center (mean/mode/median) to \(\mu\).
scale = \(\sigma\): stretches the support from \([-\pi,\pi]\) to \([\mu-\pi\sigma,\mu+\pi\sigma]\) and rescales the height by \(1/\sigma\).
There are no additional shape parameters: the overall “cosine bump” shape is fixed.
# Visualize the effect of loc/scale
params = [
dict(loc=0.0, scale=1.0, name="loc=0, scale=1"),
dict(loc=0.0, scale=0.5, name="loc=0, scale=0.5"),
dict(loc=0.0, scale=2.0, name="loc=0, scale=2"),
dict(loc=1.0, scale=1.0, name="loc=1, scale=1"),
]
fig = go.Figure()
for p in params:
lo, hi = cosine_support(p["loc"], p["scale"])
x = np.linspace(lo, hi, 2000)
y = cosine_pdf(x, p["loc"], p["scale"])
fig.add_trace(go.Scatter(x=x, y=y, mode="lines", name=p["name"]))
fig.update_layout(
title="Cosine PDF for different (loc, scale)",
xaxis_title="x",
yaxis_title="density",
)
fig.show()
6) Derivations#
Expectation (standard form)#
Because \(f(x)\) is even (symmetric about 0), \(x f(x)\) is odd, so
\[ \mathbb{E}[X] = \int_{-\pi}^{\pi} x\,f(x)\,dx = 0. \]
Variance (standard form)#
Since \(\mathbb{E}[X]=0\), \(\operatorname{Var}(X)=\mathbb{E}[X^2]\):
\[ \mathbb{E}[X^2] = \frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,(1+\cos x)\,dx = \frac{1}{\pi}\left(\int_0^\pi x^2\,dx + \int_0^\pi x^2\cos x\,dx\right). \]
Compute the two terms:
\(\int_0^\pi x^2\,dx = \pi^3/3\).
By parts: \(\int_0^\pi x^2\cos x\,dx = -2\pi\).
Therefore
\[ \operatorname{Var}(X) = \frac{1}{\pi}\left(\frac{\pi^3}{3} - 2\pi\right) = \frac{\pi^2}{3} - 2. \]
Likelihood (location–scale)#
For i.i.d. data \(x_1,\dots,x_n\) under \((\mu,\sigma)\):
\[ L(\mu,\sigma) = \prod_{i=1}^{n} \frac{1 + \cos\!\left(\frac{x_i-\mu}{\sigma}\right)}{2\pi\sigma}, \qquad \ell(\mu,\sigma) = \sum_{i=1}^{n} \log\!\left(1 + \cos\!\left(\frac{x_i-\mu}{\sigma}\right)\right) - n\log(2\pi\sigma), \]
subject to the support constraint \(x_i\in[\mu-\pi\sigma,\mu+\pi\sigma]\) for all \(i\).
There is no simple closed-form MLE; numerical optimization (as in SciPy’s fit) is typical.
def cosine_loglike(data, loc: float, scale: float) -> float:
return float(np.sum(cosine_logpdf(np.asarray(data), loc=loc, scale=scale)))
# Example: simulate data and evaluate the log-likelihood surface around the truth
true_loc, true_scale = 0.5, 1.2
x = stats.cosine.rvs(loc=true_loc, scale=true_scale, size=500, random_state=rng)
loc_grid = np.linspace(true_loc - 0.6, true_loc + 0.6, 60)
scale_grid = np.linspace(0.6, 2.0, 70)
ll = np.empty((len(scale_grid), len(loc_grid)))
for i, s in enumerate(scale_grid):
for j, m in enumerate(loc_grid):
ll[i, j] = cosine_loglike(x, loc=m, scale=s)
# Plot as a heatmap
fig = go.Figure(
data=go.Heatmap(
z=ll,
x=loc_grid,
y=scale_grid,
colorscale="Viridis",
colorbar=dict(title="log-likelihood"),
)
)
fig.add_trace(
go.Scatter(
x=[true_loc],
y=[true_scale],
mode="markers",
marker=dict(color="red", size=10),
name="true (loc, scale)",
)
)
fig.update_layout(title="Log-likelihood surface (example)", xaxis_title="loc", yaxis_title="scale")
fig.show()
7) Sampling & Simulation (NumPy-only)#
Accept–reject sampling#
We can sample from the standard cosine distribution using a uniform proposal:
Proposal: \(Z \sim \mathrm{Unif}[-\pi,\pi]\) with density \(g(z)=1/(2\pi)\).
Target: \(f(z)=(1+\cos z)/(2\pi)\).
The ratio is
\[ \frac{f(z)}{g(z)} = 1 + \cos(z) \le 2, \]
so we can take \(M=2\) and accept with probability
\[ \frac{f(z)}{M\,g(z)} = \frac{1 + \cos(z)}{2}. \]
This yields an average acceptance rate of \(1/M = 50\%\).
To sample the location–scale version, return \(X = \mu + \sigma Z\).
def cosine_rvs_numpy(
n: int,
*,
loc: float = 0.0,
scale: float = 1.0,
rng: np.random.Generator | None = None,
) -> tuple[np.ndarray, float]:
"""Sample from the cosine distribution using NumPy-only accept-reject.
Returns (samples, acceptance_rate).
"""
if n < 0:
raise ValueError("n must be >= 0")
if scale <= 0:
raise ValueError("scale must be > 0")
if rng is None:
rng = np.random.default_rng()
out = np.empty(n, dtype=float)
filled = 0
proposals = 0
accepted_total = 0
# Expected acceptance rate is exactly 0.5 for the Uniform(-pi, pi) proposal.
while filled < n:
remaining = n - filled
batch = int(max(256, 2.2 * remaining))
z = rng.uniform(-PI, PI, size=batch)
u = rng.random(size=batch)
accept = u <= (1.0 + np.cos(z)) / 2.0
accepted = z[accept]
proposals += batch
accepted_total += len(accepted)
k = min(len(accepted), remaining)
out[filled : filled + k] = accepted[:k]
filled += k
# Apply location-scale transform
out = loc + scale * out
acceptance_rate = accepted_total / proposals
return out, acceptance_rate
samples, acc_rate = cosine_rvs_numpy(200_000, rng=rng)
print("acceptance rate ~", acc_rate)
print("sample mean ~", samples.mean())
acceptance rate ~ 0.5006909090909091
sample mean ~ -0.0017187970087170262
8) Visualization#
We’ll plot:
Theoretical PDF and CDF
Histogram of Monte Carlo samples with the theoretical PDF overlay
Empirical CDF vs theoretical CDF
# PDF and CDF (standard)
x = np.linspace(-PI, PI, 2000)
fig_pdf = go.Figure()
fig_pdf.add_trace(go.Scatter(x=x, y=cosine_pdf(x), mode="lines", name="PDF"))
fig_pdf.update_layout(title="Cosine PDF (standard)", xaxis_title="x", yaxis_title="density")
fig_pdf.show()
fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=x, y=cosine_cdf(x), mode="lines", name="CDF"))
fig_cdf.update_layout(title="Cosine CDF (standard)", xaxis_title="x", yaxis_title="probability")
fig_cdf.show()
# Histogram + overlay
fig_hist = px.histogram(samples, nbins=80, histnorm="probability density", title="Monte Carlo samples")
fig_hist.add_trace(go.Scatter(x=x, y=cosine_pdf(x), mode="lines", name="theoretical PDF"))
fig_hist.update_layout(xaxis_title="x", yaxis_title="density")
fig_hist.show()
# Empirical CDF vs theoretical CDF
xs = np.sort(samples)
ys = np.arange(1, len(xs) + 1) / len(xs)
fig_ecdf = go.Figure()
fig_ecdf.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"))
fig_ecdf.add_trace(go.Scatter(x=x, y=cosine_cdf(x), mode="lines", name="theoretical CDF"))
fig_ecdf.update_layout(title="Empirical vs theoretical CDF", xaxis_title="x", yaxis_title="probability")
fig_ecdf.show()
9) SciPy Integration (scipy.stats.cosine)#
SciPy provides a ready-to-use implementation with the usual methods:
.pdf(x, loc, scale) / .logpdf(...)
.cdf(x, loc, scale)
.rvs(size, loc, scale, random_state=...)
.fit(data) (estimates loc and scale)
Below is a small end-to-end example.
dist = stats.cosine
# Generate synthetic data
true_loc, true_scale = -0.3, 0.8
x = dist.rvs(loc=true_loc, scale=true_scale, size=2000, random_state=rng)
# Fit loc/scale
loc_hat, scale_hat = dist.fit(x)
print("true loc, scale:", (true_loc, true_scale))
print("fit loc, scale:", (loc_hat, scale_hat))
# Compare fitted PDF to histogram
lo, hi = cosine_support(loc_hat, scale_hat)
x_plot = np.linspace(lo, hi, 2000)
fig = px.histogram(x, nbins=60, histnorm="probability density", title="Fit with scipy.stats.cosine")
fig.add_trace(go.Scatter(x=x_plot, y=dist.pdf(x_plot, loc=loc_hat, scale=scale_hat), mode="lines", name="fitted PDF"))
fig.add_trace(go.Scatter(x=x_plot, y=dist.pdf(x_plot, loc=true_loc, scale=true_scale), mode="lines", name="true PDF"))
fig.update_layout(xaxis_title="x", yaxis_title="density")
fig.show()
true loc, scale: (-0.3, 0.8)
fit loc, scale: (-0.2959937908105679, 0.7922314131197986)
10) Statistical Use Cases#
Hypothesis testing#
Goodness-of-fit: test whether data are consistent with a cosine model (often after fitting loc/scale).
Model comparison: compare cosine vs uniform or other bounded distributions using likelihood-based criteria.
Below are two small examples:
A KS test when parameters are known.
A simple moment-based test that distinguishes uniform vs cosine on \([-\pi,\pi]\) using \(\mathbb{E}[\cos X]\).
Bayesian modeling#
Use the cosine distribution as a bounded, smooth prior/likelihood component when you want:
support on a finite interval
a single smooth mode
zero density at endpoints
Generative modeling#
As a base distribution for bounded variables (possibly with mixtures):
\[ p(x) = \sum_{k=1}^{K} w_k\, f(x \mid \mu_k, \sigma_k), \qquad w_k \ge 0,\ \sum_k w_k = 1. \]
This can approximate a variety of unimodal/multimodal shapes on bounded domains.
# 1) KS test with known parameters (valid when parameters are fixed a priori)
loc0, scale0 = 0.2, 1.1
x0 = stats.cosine.rvs(loc=loc0, scale=scale0, size=800, random_state=rng)
# Test against the exact CDF
D, p = stats.kstest(x0, "cosine", args=(loc0, scale0))
print("KS statistic:", D)
print("p-value :", p)
# 2) Uniform vs cosine on [-pi, pi] via E[cos X]
# Under Uniform(-pi,pi): E[cos X]=0 and Var(cos X)=1/2
# Under Cosine: E[cos X]=1/2
x_u = rng.uniform(-PI, PI, size=2000)
def z_test_uniform_vs_cosine(x):
m = np.mean(np.cos(x))
se = np.sqrt(0.5 / len(x))
return m / se
z_u = z_test_uniform_vs_cosine(x_u)
z_c = z_test_uniform_vs_cosine(stats.cosine.rvs(size=2000, random_state=rng))
print()
print("Z under uniform (should be ~N(0,1)):", z_u)
print("Z under cosine (should be large +):", z_c)
KS statistic: 0.03356901014089397
p-value : 0.3209829769831638
Z under uniform (should be ~N(0,1)): 0.7131579341389477
Z under cosine (should be large +): 31.419646281119494
11) Pitfalls#
Invalid parameters: scale must be strictly positive.
Support matters: if any observation lies outside \([\mu-\pi\sigma,\mu+\pi\sigma]\), the likelihood is 0 (log-likelihood \(=-\infty\)).
Near-endpoint numerics: when \((x-\mu)/\sigma \approx \pm\pi\), then \(1+\cos(\cdot) \approx 0\) and logpdf will be very negative (as it should). Use log1p(cos(z)) for stability.
Angle data: if your variable is truly circular, you may want a wrapped/circular distribution (e.g. von Mises). The cosine distribution is on a line segment, not inherently periodic.
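The support pitfall is easy to demonstrate: a single out-of-support observation drives the total log-likelihood to \(-\infty\). A small check with SciPy (the value 3.2 is an arbitrary point just beyond \(\pi\)):

```python
import numpy as np
from scipy import stats

loc, scale = 0.0, 1.0
data = np.array([0.0, 1.0, 3.2])  # 3.2 > pi, so it lies outside the support

logpdfs = stats.cosine.logpdf(data, loc=loc, scale=scale)
print(logpdfs)        # last entry is -inf
print(logpdfs.sum())  # total log-likelihood is -inf
```

In optimization this shows up as an infeasible \((\mu, \sigma)\) pair; fitters must keep every observation inside \([\mu-\pi\sigma, \mu+\pi\sigma]\).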
12) Summary#
The cosine distribution is a continuous, bounded, symmetric density on \([-\pi,\pi]\) with PDF \(\propto 1+\cos(x)\).
It has closed-form CDF, tractable moments, and a neat accept–reject sampler with a 50% acceptance rate.
In practice, scipy.stats.cosine handles evaluation, sampling, and fitting via (loc, scale).
References
SciPy: scipy.stats.cosine
“Raised cosine” distributions/shapes in signal processing texts (shape relation)